Constructing effective descriptors that represent atomic configurations is crucial for developing successful machine-learned interatomic potentials. Widely used conventional descriptors are based on two- or three-body correlations of the atomic distribution. Recently, several limitations of these many-body descriptors in distinguishing different configurations were revealed, which adversely affects the prediction of physical properties. We propose a new class of descriptors based on persistent homology. We focus on the two-dimensional visualization of persistent homology, the persistence diagram, as a descriptor of atomic configurations in the form of an image. We demonstrate that convolutional neural network models based on this descriptor provide sufficient accuracy in predicting the mean energies of amorphous graphene and amorphous carbon. Our results provide an avenue for improving machine-learned potentials with descriptors that capture topological and geometric information.
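A minimal sketch of the pipeline this abstract describes, assuming the `ripser` package for persistent homology; the plain 2D-histogram rasterization, bin count, and filtration span are illustrative choices, not the authors' settings:

```python
# Sketch: compute a persistence diagram from a point cloud of atomic
# positions and rasterize it into an image suitable for a CNN.
# Assumes the `ripser` package; all parameters below are illustrative.
import numpy as np
from ripser import ripser

def persistence_image(points, bins=32, span=(0.0, 3.0)):
    """Rasterize the H1 persistence diagram of `points` into a bins x bins image."""
    dgms = ripser(points, maxdim=1)["dgms"]
    h1 = dgms[1]                                # H1 (loop) features as (birth, death) pairs
    birth = h1[:, 0]
    persistence = h1[:, 1] - h1[:, 0]           # lifetime of each topological feature
    img, _, _ = np.histogram2d(birth, persistence, bins=bins,
                               range=[span, span])
    return img / max(img.max(), 1.0)            # normalize to [0, 1]

# Toy configuration: 64 random "atoms" in a 10 Angstrom cube (placeholder data).
rng = np.random.default_rng(0)
atoms = rng.uniform(0.0, 10.0, size=(64, 3))
image = persistence_image(atoms)
print(image.shape)  # (32, 32): one input channel for a convolutional model
```

The resulting array can be fed to any image-based regression model; the paper's CNN architecture and its Gaussian-smoothed diagram encoding are not reproduced here.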
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
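Since the abstract notes that the models are publicly released, a checkpoint can be loaded with the Hugging Face `transformers` library; this sketch uses the smaller `bigscience/bloom-560m` variant from the same release family so it runs on modest hardware:

```python
# Sketch: load a publicly released BLOOM checkpoint and generate text.
# Uses the 560M-parameter variant; the full 176B model requires
# multi-GPU inference infrastructure.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

prompt = "BLOOM is a 176B-parameter open-access language model that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```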